Overview

Dataset statistics

Number of variables: 25
Number of observations: 11401
Missing cells: 46890
Missing cells (%): 16.5%
Total size in memory: 2.0 MiB
Average record size in memory: 186.0 B
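
As a quick cross-check, the missing-cell percentage follows directly from the three counts above. The sketch below is illustrative only: the variable names are made up here, and the memory estimate assumes the rounded 186.0 B per-record figure, so it can only approximate the reported total.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 25
n_observations = 11401
missing_cells = 46890
avg_record_bytes = 186.0  # rounded value as reported

# Share of missing cells = missing cells / (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate total memory from the rounded per-record size.
approx_total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")             # 285025
print(f"missing cells: {missing_pct:.1f}%")      # 16.5%
print(f"approx. memory: {approx_total_mib:.1f} MiB")
```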

Variable types

Numeric: 7
Text: 16
Boolean: 2

Alerts

IS_PLATFORM_USER has constant value "False" (Constant)
IS_BAD_USER is highly imbalanced (99.6%) (Imbalance)
STARTDATE has 1865 (16.4%) missing values (Missing)
ENDDATE has 2059 (18.1%) missing values (Missing)
DEGREE_RAW has 1941 (17.0%) missing values (Missing)
FIELD_RAW has 2937 (25.8%) missing values (Missing)
RSID has 1088 (9.5%) missing values (Missing)
ULTIMATE_PARENT_RSID has 1119 (9.8%) missing values (Missing)
SCHOOL_PRESTIGE has 3022 (26.5%) missing values (Missing)
CAMPUS_COUNTRY has 2576 (22.6%) missing values (Missing)
LOCATION_COUNTRY has 6084 (53.4%) missing values (Missing)
UNIVERSITY_COUNTRY has 6498 (57.0%) missing values (Missing)
UNIVERSITYURL has 2104 (18.5%) missing values (Missing)
UNIVERSITYURI has 2065 (18.1%) missing values (Missing)
UNIVERSITY_LOCATION has 6076 (53.3%) missing values (Missing)
EDUCATION_DESCRIPTION has 7453 (65.4%) missing values (Missing)
DEGREE_LEVEL has 3768 (33.0%) zeros (Zeros)
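
The per-variable missing counts in these alerts can be reconciled with the dataset-level total of 46890 missing cells. The sketch below assumes the only other missing values are the ones reported in the variable sections that follow (1 for SCHOOL and 2 for a text variable whose name did not survive in this report); the dictionary names are illustrative.

```python
n_observations = 11401

# Missing counts copied from the alerts.
alert_missing = {
    "STARTDATE": 1865,
    "ENDDATE": 2059,
    "DEGREE_RAW": 1941,
    "FIELD_RAW": 2937,
    "RSID": 1088,
    "ULTIMATE_PARENT_RSID": 1119,
    "SCHOOL_PRESTIGE": 3022,
    "CAMPUS_COUNTRY": 2576,
    "LOCATION_COUNTRY": 6084,
    "UNIVERSITY_COUNTRY": 6498,
    "UNIVERSITYURL": 2104,
    "UNIVERSITYURI": 2065,
    "UNIVERSITY_LOCATION": 6076,
    "EDUCATION_DESCRIPTION": 7453,
}

# Small counts that appear only in the variable sections (assumed to be
# the only remaining missing values).
minor_missing = {"SCHOOL": 1, "(unnamed text variable)": 2}

total_missing = sum(alert_missing.values()) + sum(minor_missing.values())
print(total_missing)  # 46890, the reported number of missing cells

# Each alert percentage is the count divided by the number of observations.
for name, count in alert_missing.items():
    print(f"{name}: {100 * count / n_observations:.1f}%")
```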

Reproduction

Analysis started: 2025-09-30 07:18:54.664685
Analysis finished: 2025-09-30 07:18:55.696919
Duration: 1.03 seconds
Software version: ydata-profiling v4.17.0

Variables

USER_ID
Real number (ℝ)

Distinct: 4521
Distinct (%): 39.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 577473961.9
Minimum: 1241390
Maximum: 2225886059
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 89.2 KiB

Quantile statistics

Minimum: 1241390
5-th percentile: 38152157
Q1: 214379847
Median: 437381605
Q3: 677072734
95-th percentile: 2061401536
Maximum: 2225886059
Range: 2224644669
Interquartile range (IQR): 462692887

Descriptive statistics

Standard deviation: 554367505.1
Coefficient of variation (CV): 0.9599870153
Kurtosis: 2.504441975
Mean: 577473961.9
Median Absolute Deviation (MAD): 233244069
Skewness: 1.779461698
Sum: 6.58378064 × 10^12
Variance: 3.073233308 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count   Frequency (%)
280043285             39      0.3%
1061941613            19      0.2%
137321879             13      0.1%
889644298             13      0.1%
2080840017            13      0.1%
1056527074            13      0.1%
47154339              12      0.1%
443026918             11      0.1%
37340671              11      0.1%
382026552             11      0.1%
Other values (4511)   11246   98.6%

Smallest values

Value     Count   Frequency (%)
1241390   1       < 0.1%
1249886   3       < 0.1%
1284328   3       < 0.1%
1490244   4       < 0.1%
1511359   2       < 0.1%

Largest values

Value        Count   Frequency (%)
2225886059   3       < 0.1%
2225747644   2       < 0.1%
2225599636   3       < 0.1%
2225251020   3       < 0.1%
2224246512   1       < 0.1%
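
Several of the USER_ID statistics above are simple functions of the others, which gives a convenient sanity check on the report. The following sketch recomputes them from the reported (rounded) figures; the variable names are illustrative, and small differences in the last digits are expected because the inputs are rounded.

```python
# Reported USER_ID statistics.
n = 11401
minimum, maximum = 1241390, 2225886059
q1, q3 = 214379847, 677072734
mean, std = 577473961.9, 554367505.1

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
total = mean * n                  # Sum (no missing values in USER_ID)
variance = std ** 2               # Variance

print(value_range)        # 2224644669
print(iqr)                # 462692887
print(f"{cv:.6f}")        # 0.959987
print(f"{total:.4e}")     # 6.5838e+12
print(f"{variance:.4e}")  # 3.0732e+17
```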

SCHOOL
Text

Distinct: 4284
Distinct (%): 37.6%
Missing: 1
Missing (%): < 0.1%
Memory size: 89.2 KiB

Length

Max length: 132
Median length: 87
Mean length: 27.54508772
Min length: 1

Characters and Unicode

Total characters: 314014
Distinct characters: 129
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2981
Unique (%): 26.1%

Sample

1st row: East Carolina University
2nd row: The State University of New York at Canton
3rd row: The University of Tennessee Health Science Center
4th row: University of Nebraska Medical Center
5th row: Northwestern University
Value                 Count   Frequency (%)
university            5977    13.6%
of                    4118    9.4%
school                2758    6.3%
college               1652    3.8%
new                   1000    2.3%
the                   971     2.2%
high                  806     1.8%
york                  748     1.7%
(blank)               582     1.3%
institute             567     1.3%
Other values (4277)   24747   56.3%

Most occurring characters

Value                Count    Frequency (%)
(space)              32548    10.4%
e                    26341    8.4%
i                    25469    8.1%
o                    23091    7.4%
n                    20525    6.5%
t                    17476    5.6%
r                    16554    5.3%
a                    15346    4.9%
s                    14818    4.7%
l                    13468    4.3%
Other values (119)   108378   34.5%

Most occurring categories, scripts, and blocks

All 314014 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

STARTDATE
Text

Missing 

Distinct: 110
Distinct (%): 1.2%
Missing: 1865
Missing (%): 16.4%
Memory size: 89.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 8.000314597
Min length: 8

Characters and Unicode

Total characters: 76291
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 34
Unique (%): 0.4%

Sample

1st row: 2023/1/1
2nd row: 2016/1/1
3rd row: 2013/1/1
4th row: 2001/1/1
5th row: 1994/1/1
Value                Count   Frequency (%)
2016/1/1             497     5.2%
2017/1/1             487     5.1%
2015/1/1             478     5.0%
2014/1/1             466     4.9%
2018/1/1             437     4.6%
2012/1/1             423     4.4%
2013/1/1             417     4.4%
2019/1/1             414     4.3%
2011/1/1             392     4.1%
2020/1/1             386     4.0%
Other values (100)   5139    53.9%

Most occurring characters

Value   Count   Frequency (%)
1       25922   34.0%
/       19072   25.0%
0       11136   14.6%
2       10410   13.6%
9       3434    4.5%
8       1392    1.8%
7       1070    1.4%
4       997     1.3%
6       984     1.3%
3       943     1.2%

Most occurring categories, scripts, and blocks

All 76291 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

ENDDATE
Text

Missing 

Distinct: 183
Distinct (%): 2.0%
Missing: 2059
Missing (%): 18.1%
Memory size: 89.2 KiB

Length

Max length: 10
Median length: 8
Mean length: 8.376792978
Min length: 8

Characters and Unicode

Total characters: 78256
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 45
Unique (%): 0.5%

Sample

1st row: 2026/1/1
2nd row: 2021/1/31
3rd row: 2019/1/31
4th row: 2005/1/1
5th row: 1998/1/1
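
STARTDATE and ENDDATE are profiled as Text, and the sample values use a year/month/day layout without zero padding. If they need to be treated as dates downstream, a minimal parsing sketch could look like the following. The helper name `parse_profile_date` is hypothetical, and the format string is an assumption based only on the sample rows shown here.

```python
from datetime import date, datetime
from typing import Optional


def parse_profile_date(text: Optional[str]) -> Optional[date]:
    """Parse a 'YYYY/M/D' string; return None for missing values."""
    if text is None or text == "":
        return None
    # strptime accepts non-zero-padded month and day numbers for %m and %d.
    return datetime.strptime(text, "%Y/%m/%d").date()


# Sample ENDDATE values from the report, plus one missing value.
samples = ["2026/1/1", "2021/1/31", "2019/1/31", "2005/1/1", "1998/1/1", None]
parsed = [parse_profile_date(s) for s in samples]

for raw, value in zip(samples, parsed):
    print(raw, "->", value)
```

Values that do not match the assumed layout raise `ValueError`, which is usually preferable to silently coercing them when the column has not been validated.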
Value                Count   Frequency (%)
2019/1/1             316     3.4%
2020/1/1             307     3.3%
2021/1/1             301     3.2%
2018/1/1             287     3.1%
2017/1/1             286     3.1%
2016/1/1             283     3.0%
2015/1/1             265     2.8%
2023/1/1             247     2.6%
2022/1/1             232     2.5%
2012/1/1             232     2.5%
Other values (173)   6586    70.5%

Most occurring characters

Value   Count   Frequency (%)
1       24785   31.7%
/       18684   23.9%
2       11712   15.0%
0       10681   13.6%
3       4492    5.7%
9       2849    3.6%
8       1205    1.5%
4       1000    1.3%
7       971     1.2%
5       968     1.2%

Most occurring categories, scripts, and blocks

All 78256 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

DEGREE
Text

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 89.2 KiB

Length

Max length: 11
Median length: 9
Mean length: 6.5850364
Min length: 3

Characters and Unicode

Total characters: 75076
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Master
2nd row: Bachelor
3rd row: Doctor
4th row: Doctor
5th row: Doctor
Value       Count   Frequency (%)
bachelor    3900    32.4%
empty       3768    31.3%
master      1585    13.2%
doctor      944     7.8%
high        632     5.3%
school      632     5.3%
mba         373     3.1%
associate   199     1.7%
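
The table above lists eight entries for a variable with seven distinct values, and its percentages are shares of words rather than of rows. A plausible explanation, sketched below, is that values are lowercased and split on whitespace before counting, so a two-word value contributes two tokens. The seven category labels used here are an assumption inferred from the sample rows, the word table, and the character totals; they are not listed explicitly in the report.

```python
from collections import Counter

# Assumed DEGREE categories and row counts (inferred, see note above).
value_counts = {
    "Bachelor": 3900,
    "empty": 3768,
    "Master": 1585,
    "Doctor": 944,
    "High School": 632,
    "MBA": 373,
    "Associate": 199,
}

# Count lowercase whitespace-separated tokens, weighted by row count.
token_counts = Counter()
for value, count in value_counts.items():
    for token in value.lower().split():
        token_counts[token] += count

total_rows = sum(value_counts.values())
total_tokens = sum(token_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())

print(total_rows, total_tokens, total_chars)  # 11401 12033 75076
print(f"'empty' share of tokens: {100 * token_counts['empty'] / total_tokens:.1f}%")  # 31.3%
print(f"'empty' share of rows:   {100 * value_counts['empty'] / total_rows:.1f}%")    # 33.0%
```

Under these assumptions the token shares match the table (31.3% for "empty"), the character total matches the reported 75076, and the 3768 "empty" rows match the 33.0% share quoted in the DEGREE_LEVEL zeros alert.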

Most occurring characters

Value               Count   Frequency (%)
e                   9452    12.6%
o                   7251    9.7%
t                   6496    8.7%
r                   6429    8.6%
a                   5684    7.6%
c                   5675    7.6%
h                   5164    6.9%
l                   4532    6.0%
B                   4273    5.7%
p                   3768    5.0%
Other values (11)   16352   21.8%

Most occurring categories, scripts, and blocks

All 75076 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

FIELD
Text

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 89.2 KiB

Length

Max length: 22
Median length: 5
Mean length: 6.43794404
Min length: 3

Characters and Unicode

Total characters: 73399
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: empty
2nd row: Business
3rd row: Nursing
4th row: Biology
5th row: empty
Value              Count   Frequency (%)
empty              6992    61.1%
business           1340    11.7%
engineering        934     8.2%
law                280     2.4%
economics          262     2.3%
finance            199     1.7%
biology            187     1.6%
medicine           183     1.6%
education          182     1.6%
marketing          153     1.3%
Other values (9)   723     6.3%

Most occurring characters

Value               Count   Frequency (%)
e                   11428   15.6%
t                   8141    11.1%
m                   7474    10.2%
y                   7400    10.1%
p                   6992    9.5%
n                   5745    7.8%
i                   5277    7.2%
s                   4793    6.5%
g                   2444    3.3%
u                   1869    2.5%
Other values (23)   11836   16.1%

Most occurring categories, scripts, and blocks

All 73399 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

DEGREE_RAW
Text

Missing 

Distinct: 2521
Distinct (%): 26.6%
Missing: 1941
Missing (%): 17.0%
Memory size: 89.2 KiB

Length

Max length: 100
Median length: 89
Mean length: 20.92230444
Min length: 1

Characters and Unicode

Total characters: 197925
Distinct characters: 106
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2085
Unique (%): 22.0%

Sample

1st row: M.S.
2nd row: Bachelor
3rd row: Doctor of Philosophy - PhD
4th row: Doctor of Philosophy (PhD)
5th row: Doctor of Philosophy (Ph.D.)
Value                 Count   Frequency (%)
of                    4150    13.1%
bachelor              2399    7.6%
(blank)               2145    6.8%
science               1487    4.7%
degree                1343    4.2%
arts                  1250    3.9%
master                1143    3.6%
ba                    731     2.3%
bachelor's            729     2.3%
bs                    671     2.1%
Other values (1914)   15626   49.3%

Most occurring characters

Value               Count   Frequency (%)
(space)             22229   11.2%
e                   18368   9.3%
o                   14877   7.5%
r                   12121   6.1%
c                   10655   5.4%
a                   10374   5.2%
i                   10299   5.2%
s                   8891    4.5%
t                   8400    4.2%
n                   7757    3.9%
Other values (96)   73954   37.4%

Most occurring categories, scripts, and blocks

All 197925 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

FIELD_RAW
Text

Missing 

Distinct: 4129
Distinct (%): 48.8%
Missing: 2937
Missing (%): 25.8%
Memory size: 89.2 KiB

Length

Max length: 168
Median length: 88
Mean length: 24.81415406
Min length: 1

Characters and Unicode

Total characters: 210027
Distinct characters: 113
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 3446
Unique (%): 40.7%

Sample

1st row: Instructional Technology
2nd row: Business Administration; Management
3rd row: Nursing Science
4th row: Biochemistry and Molecular Biology
5th row: Geological and Earth Sciences/Geosciences
Value                 Count   Frequency (%)
and                   2005    8.1%
science               772     3.1%
(blank)               623     2.5%
management            557     2.3%
business              539     2.2%
engineering           516     2.1%
studies               474     1.9%
computer              420     1.7%
finance               372     1.5%
economics             336     1.4%
Other values (2567)   18098   73.2%

Most occurring characters

Value                Count   Frequency (%)
i                    18862   9.0%
n                    18789   8.9%
e                    17929   8.5%
(space)              16280   7.8%
a                    14923   7.1%
o                    12328   5.9%
t                    11794   5.6%
c                    10359   4.9%
r                    10016   4.8%
s                    9424    4.5%
Other values (103)   69323   33.0%

Most occurring categories, scripts, and blocks

All 210027 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

RSID
Real number (ℝ)

Missing 

Distinct: 2929
Distinct (%): 28.4%
Missing: 1088
Missing (%): 9.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 119774.0406
Minimum: 9
Maximum: 296932
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 89.2 KiB
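
For variables with missing values, the "Distinct (%)" figure appears to be computed against the non-missing rows rather than all 11401 observations. The sketch below checks that reading against the reported numbers; the function name is illustrative and the denominator is an inference from the figures, not something the report states.

```python
def distinct_pct(distinct: int, missing: int, n_rows: int = 11401) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    return 100 * distinct / (n_rows - missing)


rsid_pct = distinct_pct(distinct=2929, missing=1088)
parent_pct = distinct_pct(distinct=2612, missing=1119)
startdate_pct = distinct_pct(distinct=110, missing=1865)

print(f"RSID: {rsid_pct:.1f}%")                    # 28.4%
print(f"ULTIMATE_PARENT_RSID: {parent_pct:.1f}%")  # 25.4%
print(f"STARTDATE: {startdate_pct:.1f}%")          # 1.2%

# Dividing by all rows instead would give a different figure for RSID.
print(f"RSID over all rows: {100 * 2929 / 11401:.1f}%")  # 25.7%
```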

Quantile statistics

Minimum: 9
5-th percentile: 7837
Q1: 65045
Median: 115965
Q3: 155147
95-th percentile: 267350
Maximum: 296932
Range: 296923
Interquartile range (IQR): 90102

Descriptive statistics

Standard deviation: 71806.62508
Coefficient of variation (CV): 0.5995174306
Kurtosis: -0.1483616846
Mean: 119774.0406
Median Absolute Deviation (MAD): 41405
Skewness: 0.5624067994
Sum: 1235229681
Variance: 5156191406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count   Frequency (%)
98025                 300     2.6%
46455                 231     2.0%
47013                 101     0.9%
107574                90      0.8%
116341                87      0.8%
61156                 87      0.8%
66310                 81      0.7%
163889                72      0.6%
107986                68      0.6%
119646                67      0.6%
Other values (2919)   9129    80.1%
(Missing)             1088    9.5%

Smallest values

Value   Count   Frequency (%)
9       27      0.2%
37      1       < 0.1%
44      1       < 0.1%
58      1       < 0.1%
66      2       < 0.1%

Largest values

Value    Count   Frequency (%)
296932   1       < 0.1%
294291   1       < 0.1%
294288   1       < 0.1%
294241   1       < 0.1%
294210   1       < 0.1%

ULTIMATE_PARENT_RSID
Real number (ℝ)

Missing 

Distinct: 2612
Distinct (%): 25.4%
Missing: 1119
Missing (%): 9.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 120285.8274
Minimum: 9
Maximum: 296932
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 89.2 KiB

Quantile statistics

Minimum: 9
5-th percentile: 22485
Q1: 66310
Median: 115959
Q3: 154166
95-th percentile: 265493.9
Maximum: 296932
Range: 296923
Interquartile range (IQR): 87856

Descriptive statistics

Standard deviation: 68861.37824
Coefficient of variation (CV): 0.5724812286
Kurtosis: 0.04971963483
Mean: 120285.8274
Median Absolute Deviation (MAD): 39237
Skewness: 0.6488442488
Sum: 1236778877
Variance: 4741889413
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count   Frequency (%)
98025                 448     3.9%
46455                 368     3.2%
66310                 170     1.5%
47013                 144     1.3%
107573                139     1.2%
155147                116     1.0%
61766                 108     0.9%
163889                105     0.9%
116341                92      0.8%
61156                 87      0.8%
Other values (2602)   8505    74.6%
(Missing)             1119    9.8%

Smallest values

Value   Count   Frequency (%)
9       35      0.3%
44      1       < 0.1%
66      2       < 0.1%
67      1       < 0.1%
68      3       < 0.1%

Largest values

Value    Count   Frequency (%)
296932   1       < 0.1%
294291   1       < 0.1%
294288   1       < 0.1%
294210   1       < 0.1%
294204   1       < 0.1%

SCHOOL_PRESTIGE
Real number (ℝ)

Missing 

Distinct: 1423
Distinct (%): 17.0%
Missing: 3022
Missing (%): 26.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3374215692
Minimum: -0.978193998
Maximum: 0.986606002
Zeros: 0
Zeros (%): 0.0%
Negative: 3073
Negative (%): 27.0%
Memory size: 89.2 KiB

Quantile statistics

Minimum: -0.978193998
5-th percentile: -0.808278024
Q1: -0.564356029
Median: 0.918618023
Q3: 0.960698009
95-th percentile: 0.968349993
Maximum: 0.986606002
Range: 1.9648
Interquartile range (IQR): 1.525054038

Descriptive statistics

Standard deviation: 0.7570980225
Coefficient of variation (CV): 2.243774825
Kurtosis: -1.615280487
Mean: 0.3374215692
Median Absolute Deviation (MAD): 0.049571991
Skewness: -0.5176016895
Sum: 2827.255329
Variance: 0.5731974157
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count   Frequency (%)
0.960698009           449     3.9%
0.963714004           379     3.3%
0.967827976           178     1.6%
0.956879973           144     1.3%
0.940025985           131     1.1%
0.964698017           116     1.0%
-0.453042001          108     0.9%
0.967414022           105     0.9%
0.924655974           92      0.8%
-0.638598025          87      0.8%
Other values (1413)   6590    57.8%
(Missing)             3022    26.5%

Smallest values

Value          Count   Frequency (%)
-0.978193998   2       < 0.1%
-0.974005997   3       < 0.1%
-0.969403982   1       < 0.1%
-0.967266023   1       < 0.1%
-0.965373993   2       < 0.1%

Largest values

Value         Count   Frequency (%)
0.986606002   2       < 0.1%
0.984456003   6       0.1%
0.984380007   15      0.1%
0.984134018   2       < 0.1%
0.98388797    6       0.1%
Distinct: 4243
Distinct (%): 37.2%
Missing: 2
Missing (%): < 0.1%
Memory size: 89.2 KiB

Length

Max length: 127
Median length: 88
Mean length: 27.30116677
Min length: 2

Characters and Unicode

Total characters: 311206
Distinct characters: 81
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2927
Unique (%): 25.7%

Sample

1st row: east carolina university
2nd row: the state university of new york at canton
3rd row: the university of tennessee health science center
4th row: university of nebraska medical center
5th row: northwestern university
Value                 Count   Frequency (%)
university            6046    13.7%
of                    4118    9.3%
school                2762    6.3%
college               1678    3.8%
new                   1054    2.4%
the                   973     2.2%
high                  806     1.8%
york                  782     1.8%
institute             569     1.3%
business              515     1.2%
Other values (4205)   24753   56.2%

Most occurring characters

Value               Count   Frequency (%)
(space)             32657   10.5%
e                   27124   8.7%
i                   26987   8.7%
o                   23455   7.5%
n                   22516   7.2%
s                   20551   6.6%
t                   19495   6.3%
r                   17306   5.6%
a                   16922   5.4%
l                   14728   4.7%
Other values (71)   89465   28.7%

Most occurring categories, scripts, and blocks

All 311206 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

CAMPUS_COUNTRY
Text

Missing 

Distinct: 70
Distinct (%): 0.8%
Missing: 2576
Missing (%): 22.6%
Memory size: 89.2 KiB

Length

Max length: 20
Median length: 13
Mean length: 12.19082153
Min length: 4

Characters and Unicode

Total characters: 107584
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 17
Unique (%): 0.2%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: Philippines
5th row: Sweden
Value               Count   Frequency (%)
united              7640    46.2%
states              7410    44.8%
kingdom             229     1.4%
india               192     1.2%
canada              117     0.7%
france              69      0.4%
italy               68      0.4%
spain               64      0.4%
australia           61      0.4%
china               54      0.3%
Other values (67)   645     3.9%

Most occurring characters

Value               Count   Frequency (%)
t                   22755   21.2%
e                   15551   14.5%
a                   8828    8.2%
n                   8731    8.1%
i                   8576    8.0%
d                   8296    7.7%
(space)             7724    7.2%
s                   7644    7.1%
U                   7641    7.1%
S                   7583    7.0%
Other values (35)   4255    4.0%

Most occurring categories, scripts, and blocks

All 107584 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

LOCATION_COUNTRY
Text

Missing 

Distinct: 62
Distinct (%): 1.2%
Missing: 6084
Missing (%): 53.4%
Memory size: 89.2 KiB

Length

Max length: 20
Median length: 13
Mean length: 12.20199361
Min length: 4

Characters and Unicode

Total characters: 64878
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 16
Unique (%): 0.3%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: Philippines
5th row: United States
Value               Count   Frequency (%)
united              4604    46.1%
states              4423    44.3%
kingdom             180     1.8%
china               97      1.0%
canada              94      0.9%
india               59      0.6%
australia           44      0.4%
south               36      0.4%
italy               33      0.3%
israel              30      0.3%
Other values (59)   381     3.8%

Most occurring characters

Value               Count   Frequency (%)
t                   13623   21.0%
e                   9239    14.2%
a                   5384    8.3%
n                   5280    8.1%
i                   5251    8.1%
d                   4977    7.7%
(space)             4664    7.2%
U                   4605    7.1%
s                   4566    7.0%
S                   4491    6.9%
Other values (35)   2798    4.3%

Most occurring categories, scripts, and blocks

All 64878 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

UNIVERSITY_COUNTRY
Text

Missing 

Distinct: 69
Distinct (%): 1.4%
Missing: 6498
Missing (%): 57.0%
Memory size: 89.2 KiB

Length

Max length: 21
Median length: 13
Mean length: 11.83805833
Min length: 4

Characters and Unicode

Total characters: 58042
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 14
Unique (%): 0.3%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States
Value               Count   Frequency (%)
united              3991    44.6%
states              3893    43.5%
china               108     1.2%
kingdom             97      1.1%
india               97      1.1%
canada              70      0.8%
australia           42      0.5%
france              40      0.4%
spain               37      0.4%
italy               36      0.4%
Other values (71)   542     6.1%

Most occurring characters

Value               Count   Frequency (%)
t                   11964   20.6%
e                   8230    14.2%
a                   5064    8.7%
n                   4769    8.2%
i                   4673    8.1%
d                   4325    7.5%
(space)             4050    7.0%
s                   4038    7.0%
U                   3996    6.9%
S                   3963    6.8%
Other values (36)   2970    5.1%

Most occurring categories, scripts, and blocks

All 58042 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

IS_BAD_USER
Boolean

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 KiB

Value   Count   Frequency (%)
False   11398   > 99.9%
True    3       < 0.1%
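
The 99.6% imbalance figure quoted in the alert for IS_BAD_USER can be reproduced from these two counts if the score is taken to be one minus the normalized Shannon entropy of the class distribution. That definition is an assumption here (the report does not state the formula), and the function below is a standalone sketch rather than the profiler's own code.

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates. A single-class input is scored 0.0 by convention.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


# Counts from the IS_BAD_USER table: 11398 False, 3 True.
score = imbalance_score([11398, 3])
print(f"{100 * score:.1f}%")  # 99.6%
```

A perfectly balanced column such as `imbalance_score([5, 5])` scores 0.0 under the same definition.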

IS_PLATFORM_USER
Boolean

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.3 KiB

Value   Count   Frequency (%)
False   11401   100.0%

URI
Text

Distinct: 4521
Distinct (%): 39.7%
Missing: 0
Missing (%): 0.0%
Memory size: 89.2 KiB

Length

Max length: 87
Median length: 62
Mean length: 33.93167266
Min length: 19

Characters and Unicode

Total characters: 386855
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1031
Unique (%): 9.0%

Sample

1st row: linkedin.com/in/bscobb
2nd row: linkedin.com/in/kashifrivers
3rd row: linkedin.com/in/ulanda-marcus-aiyeku-dnp-pmhnp-bc-ne-bc-840352124
4th row: linkedin.com/in/pranita-atri-76490773
5th row: linkedin.com/in/eddie-brooks-0b01289b
Value                                              Count   Frequency (%)
linkedin.com/in/hasantimucinozdemir                39      0.3%
linkedin.com/in/jillchasse                         19      0.2%
linkedin.com/in/vesteragerm                        13      0.1%
linkedin.com/in/chloe-luterman                     13      0.1%
linkedin.com/in/dominic-desapio-07774a62           13      0.1%
linkedin.com/in/andraya-yearwood-9b3859182         13      0.1%
linkedin.com/in/oliverknesl                        12      0.1%
linkedin.com/in/louis-l-nock-83a01734              11      0.1%
linkedin.com/in/andraya-y-9b3859182                11      0.1%
linkedin.com/in/pietro-nardella-dellova-aa25042a   11      0.1%
Other values (4511)                                11246   98.6%

Most occurring characters

Value               Count    Frequency (%)
n                   46027    11.9%
i                   45395    11.7%
e                   25134    6.5%
a                   22865    5.9%
/                   22802    5.9%
l                   20094    5.2%
o                   19272    5.0%
m                   17120    4.4%
c                   16264    4.2%
d                   16018    4.1%
Other values (50)   135864   35.1%

Most occurring categories, scripts, and blocks

All 386855 characters (100.0%) belong to a single category, script, and block, each reported as (unknown). The per-category, per-script, and per-block character frequencies are therefore identical to the "Most occurring characters" table.

EDUCATION_ID
Real number (ℝ)

Distinct 11380
Distinct (%) 99.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean -9.216409477 × 10^15
Minimum -9.22169 × 10^18
Maximum 9.222084882 × 10^18
Zeros 0
Zeros (%) 0.0%
Negative 5730
Negative (%) 50.3%
Memory size 89.2 KiB

Quantile statistics

Minimum -9.22169 × 10^18
5-th percentile -8.33183 × 10^18
Q1 -4.69769 × 10^18
median -3.12178 × 10^16
Q3 4.630562234 × 10^18
95-th percentile 8.342237691 × 10^18
Maximum 9.222084882 × 10^18
Range 1.844377488 × 10^19
Interquartile range (IQR) 9.328252234 × 10^18

Descriptive statistics

Standard deviation 5.350778188 × 10^18
Coefficient of variation (CV) -580.5707962
Kurtosis -1.202570239
Mean -9.216409477 × 10^15
Median Absolute Deviation (MAD) 4.663510893 × 10^18
Skewness 0.007905324779
Sum -1.050762845 × 10^20
Variance 2.863082722 × 10^37
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
-1.20821 × 10^18  2  < 0.1%
-5.18579 × 10^18  2  < 0.1%
-3.76894 × 10^18  2  < 0.1%
-7.9522 × 10^18  2  < 0.1%
-3.95649 × 10^18  2  < 0.1%
-5.9951 × 10^18  2  < 0.1%
-2.50361 × 10^18  2  < 0.1%
-8.62531 × 10^18  2  < 0.1%
-8.90762 × 10^18  2  < 0.1%
-8.34148 × 10^18  2  < 0.1%
Other values (11370)  11381  99.8%

Value  Count  Frequency (%)
-9.22169 × 10^18  1  < 0.1%
-9.2214 × 10^18  1  < 0.1%
-9.22038 × 10^18  1  < 0.1%
-9.21985 × 10^18  1  < 0.1%
-9.2193 × 10^18  1  < 0.1%

Value  Count  Frequency (%)
9.222084882 × 10^18  1  < 0.1%
9.221546001 × 10^18  1  < 0.1%
9.219007644 × 10^18  1  < 0.1%
9.217667956 × 10^18  1  < 0.1%
9.217205312 × 10^18  1  < 0.1%
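
The reported minimum and maximum of EDUCATION_ID sit just inside the signed 64-bit integer range (about ±9.223 × 10^18). That suggests, although the report does not say so, that the column holds integer identifiers that were profiled as real numbers. The sketch below assumes that is the case and shows why it may matter: float64 has a 53-bit significand, so not every integer of this size survives a round trip through a float. The helper names and the example ID are hypothetical.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_int64(value: int) -> bool:
    """True if value lies inside the signed 64-bit integer range."""
    return INT64_MIN <= value <= INT64_MAX


def survives_float64(value: int) -> bool:
    """True if converting value to float64 and back returns the same integer."""
    return int(float(value)) == value


# Extremes as printed in the report (already rounded for display).
reported_min = -922169 * 10**13      # -9.22169 x 10^18
reported_max = 9222084882 * 10**9    # 9.222084882 x 10^18

# Hypothetical ID one above the rounded maximum, for illustration only.
example_id = reported_max + 1

print(fits_int64(reported_min), fits_int64(reported_max))
print(survives_float64(example_id))
```

If the assumption holds, reading the column as a 64-bit integer or as a string would keep each identifier exact.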

DEGREE_LEVEL
Real number (ℝ)

Zeros 

Distinct 7
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2.333040961
Minimum 0
Maximum 6
Zeros 3768
Zeros (%) 33.0%
Negative 0
Negative (%) 0.0%
Memory size 89.2 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 3
Q3 4
95-th percentile 6
Maximum 6
Range 6
Interquartile range (IQR) 4

Descriptive statistics

Standard deviation 1.945314737
Coefficient of variation (CV) 0.8338107944
Kurtosis -1.062920488
Mean 2.333040961
Median Absolute Deviation (MAD) 2
Skewness 0.1521594592
Sum 26599
Variance 3.784249427
Monotonicity Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
3  3900  34.2%
0  3768  33.0%
4  1585  13.9%
6  944  8.3%
1  632  5.5%
5  373  3.3%
2  199  1.7%

Value  Count  Frequency (%)
0  3768  33.0%
1  632  5.5%
2  199  1.7%
3  3900  34.2%
4  1585  13.9%

Value  Count  Frequency (%)
6  944  8.3%
5  373  3.3%
4  1585  13.9%
3  3900  34.2%
2  199  1.7%
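
Because DEGREE_LEVEL takes only seven values, its summary statistics can be recomputed from the frequency table alone. The sketch below does that as a consistency check. It uses the sample variance (n - 1 denominator), which is the convention that appears to match the reported figure; that is an inference from the numbers, not something the report states.

```python
import math

# Value -> count, copied from the DEGREE_LEVEL frequency table.
counts = {0: 3768, 1: 632, 2: 199, 3: 3900, 4: 1585, 5: 373, 6: 944}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # "Sum" in the report
mean = total / n

# Sample variance with an (n - 1) denominator (assumed convention).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                   # coefficient of variation

print(n, total)
print(mean, variance, std, cv)
```

The recomputed values can then be compared against the Mean, Sum, Variance, Standard deviation and CV lines of the report.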

SEQUENCENO
Real number (ℝ)

Distinct 39
Distinct (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2.241557758
Minimum 1
Maximum 39
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 89.2 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 1
median 2
Q3 3
95-th percentile 5
Maximum 39
Range 38
Interquartile range (IQR) 2

Descriptive statistics

Standard deviation 1.927564365
Coefficient of variation (CV) 0.8599217922
Kurtosis 93.63555805
Mean 2.241557758
Median Absolute Deviation (MAD) 1
Skewness 7.01398027
Sum 25556
Variance 3.71550438
Monotonicity Not monotonic
Histogram with fixed size bins (bins=39)
Value  Count  Frequency (%)
1  4509  39.5%
2  3458  30.3%
3  1863  16.3%
4  816  7.2%
5  354  3.1%
6  160  1.4%
7  87  0.8%
8  47  0.4%
9  30  0.3%
10  18  0.2%
Other values (29)  59  0.5%

Value  Count  Frequency (%)
1  4509  39.5%
2  3458  30.3%
3  1863  16.3%
4  816  7.2%
5  354  3.1%

Value  Count  Frequency (%)
39  1  < 0.1%
38  1  < 0.1%
37  1  < 0.1%
36  1  < 0.1%
35  1  < 0.1%

UNIVERSITYURL
Text

Missing 

Distinct 3371
Distinct (%) 36.3%
Missing 2104
Missing (%) 18.5%
Memory size 89.2 KiB

Length

Max length 338
Median length 266
Mean length 48.26632247
Min length 36

Characters and Unicode

Total characters 448732
Distinct characters 50
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2068
Unique (%) 22.2%

Sample

1st row https://www.linkedin.com/company/7522/
2nd row https://www.linkedin.com/company/230771/
3rd row https://www.linkedin.com/school/northwestern-university/
4th row https://www.linkedin.com/company/5077/
5th row https://www.linkedin.com/school/university-of-gothenburg/
Value  Count  Frequency (%)
https://www.linkedin.com/school/new-york-university  159  1.7%
https://www.linkedin.com/school/columbia-university  150  1.6%
https://www.linkedin.com/company/3159  117  1.3%
https://www.linkedin.com/company/2624  82  0.9%
https://www.linkedin.com/school/cornell-university  63  0.7%
https://www.linkedin.com/school/fashion-institute-of-technology  51  0.5%
https://www.linkedin.com/company/4262  51  0.5%
https://www.linkedin.com/company/7201  48  0.5%
https://www.linkedin.com/school/yale-university  44  0.5%
https://www.linkedin.com/company/7338  36  0.4%
Other values (3361)  8496  91.4%

Most occurring characters

Value  Count  Frequency (%)
/  46484  10.4%
o  31827  7.1%
n  31156  6.9%
w  28886  6.4%
i  28502  6.4%
t  25415  5.7%
c  22802  5.1%
s  21948  4.9%
l  19906  4.4%
e  18998  4.2%
Other values (40)  172808  38.5%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  448732  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  448732  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  448732  100.0%

UNIVERSITYURI
Text

Missing 

Distinct 3403
Distinct (%) 36.5%
Missing 2065
Missing (%) 18.1%
Memory size 89.2 KiB

Length

Max length 325
Median length 253
Mean length 35.30676949
Min length 23

Characters and Unicode

Total characters 329624
Distinct characters 49
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2094
Unique (%) 22.4%

Sample

1st row linkedin.com/company/7522
2nd row linkedin.com/company/230771
3rd row linkedin.com/school/northwestern-university
4th row linkedin.com/company/5077
5th row linkedin.com/school/university-of-gothenburg
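
The UNIVERSITYURI sample rows look like the UNIVERSITYURL sample rows with the scheme, the www. prefix and the trailing slash removed. The 13-character gap between the two columns' minimum lengths (36 vs. 23) and maximum lengths (338 vs. 325) is consistent with that reading. The report does not document how the URI is derived, so the function below is only a hypothetical sketch of such a normalisation, not the pipeline's actual rule.

```python
def to_uri(url: str) -> str:
    """Hypothetical normalisation: drop scheme, leading 'www.' and trailing '/'."""
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
            break
    if url.startswith("www."):
        url = url[len("www."):]
    return url.rstrip("/")


# Sample rows taken from the UNIVERSITYURL section of the report.
samples = [
    "https://www.linkedin.com/company/7522/",
    "https://www.linkedin.com/school/northwestern-university/",
]
for sample in samples:
    print(sample, "->", to_uri(sample))
```

The two columns have different missing counts (2104 vs. 2065), so a rule like this one cannot account for every row.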
Value  Count  Frequency (%)
linkedin.com/school/new-york-university  161  1.7%
linkedin.com/school/columbia-university  150  1.6%
linkedin.com/company/3159  117  1.3%
linkedin.com/company/2624  82  0.9%
linkedin.com/school/cornell-university  65  0.7%
linkedin.com/company/4262  51  0.5%
linkedin.com/school/fashion-institute-of-technology  51  0.5%
linkedin.com/company/7201  48  0.5%
linkedin.com/school/yale-university  45  0.5%
linkedin.com/school/harvard-university  37  0.4%
Other values (3371)  8529  91.4%

Most occurring characters

Value  Count  Frequency (%)
o  32016  9.7%
n  31306  9.5%
i  28660  8.7%
c  22915  7.0%
l  20040  6.1%
e  19121  5.8%
/  18711  5.7%
m  15524  4.7%
s  12743  3.9%
d  11095  3.4%
Other values (39)  117493  35.6%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  329624  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  329624  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  329624  100.0%

UNIVERSITY_LOCATION
Text

Missing 

Distinct 847
Distinct (%) 15.9%
Missing 6076
Missing (%) 53.3%
Memory size 89.2 KiB

Length

Max length 93
Median length 46
Mean length 14.75173709
Min length 3

Characters and Unicode

Total characters 78553
Distinct characters 95
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 342
Unique (%) 6.4%

Sample

1st row Greenville, NC
2nd row Memphis, Tennessee
3rd row Omaha, Nebraska
4th row Laoag, Ilocos Norte
5th row New York, NY
Value  Count  Frequency (%)
ny  1326  10.4%
new  1090  8.6%
york  897  7.0%
nj  325  2.6%
pa  246  1.9%
ca  212  1.7%
ma  210  1.6%
chicago  145  1.1%
massachusetts  139  1.1%
cambridge  134  1.1%
Other values (983)  8014  62.9%

Most occurring characters

Value  Count  Frequency (%)
(space)  7414  9.4%
e  5353  6.8%
a  5214  6.6%
,  5182  6.6%
n  5002  6.4%
o  5002  6.4%
r  3986  5.1%
i  3599  4.6%
t  2955  3.8%
N  2765  3.5%
Other values (85)  32081  40.8%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  78553  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  78553  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  78553  100.0%

EDUCATION_DESCRIPTION
Text

Missing 

Distinct 3827
Distinct (%) 96.9%
Missing 7453
Missing (%) 65.4%
Memory size 89.2 KiB

Length

Max length 1590
Median length 726
Mean length 168.9898683
Min length 3

Characters and Unicode

Total characters 667172
Distinct characters 124
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3763
Unique (%) 95.3%
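
The Distinct (%) and Unique (%) figures for this variable only add up if the denominator is the number of non-missing rows (11401 - 7453 = 3948), not the full 11401 observations. The sketch below checks that reading against the reported numbers. The function name is hypothetical, and the denominator is inferred from the figures, not taken from the profiler's documentation.

```python
TOTAL_ROWS = 11401  # "Number of observations" in the report overview


def pct_of_non_missing(count: int, missing: int, total: int = TOTAL_ROWS) -> float:
    """Share of `count` among non-missing rows, as a percentage (one decimal)."""
    return round(100.0 * count / (total - missing), 1)


# EDUCATION_DESCRIPTION: distinct = 3827, unique = 3763, missing = 7453
print(pct_of_non_missing(3827, 7453))
print(pct_of_non_missing(3763, 7453))

# Same check for another sparse text column, UNIVERSITY_LOCATION:
# distinct = 847, missing = 6076
print(pct_of_non_missing(847, 6076))
```

Dividing by all 11401 rows instead would give about 33.6% for Distinct (%), which does not match the report.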

Sample

1st row Activities and Societies: The Golden Key International Honour Society
2nd row Activities and Societies: President, Graduate Student Association (2020-2021) Graduate Student Senator, UNMC student senate (2020-2021) Vice President, Graduate Student Association (2019-2020) Treasurer, International Student Association (2018-2019)
3rd row Ph.D. research explores earthquake hazard maps, how to assess their performance, and measuring uncertainties in their calculations. We hope to address the questions of why maps sometimes fail with disastrous consequences (such as Tohoku 2011, Haiti 2011, or Nepal 2015), and suggest ways to improve their generation and performance.
4th row Activities and Societies: A member of the Presidential Scholars
5th row Coursework covers univariate and multivariate analysis. Additional focus has been placed on machine learning, time series analysis, computing and experimental design.
Value  Count  Frequency (%)
and  4689  5.1%
of  2602  2.8%
the  2386  2.6%
(blank)  2383  2.6%
in  2111  2.3%
activities  1389  1.5%
societies  1363  1.5%
for  978  1.1%
to  975  1.1%
a  951  1.0%
Other values (12545)  72512  78.5%

Most occurring characters

Value  Count  Frequency (%)
(space)  88250  13.2%
e  56286  8.4%
i  47919  7.2%
a  42809  6.4%
t  41487  6.2%
n  41260  6.2%
o  37837  5.7%
r  33015  4.9%
s  31756  4.8%
c  21919  3.3%
Other values (114)  224634  33.7%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  667172  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  667172  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  667172  100.0%